Multi-scale hybrid vision transformer and Sinkhorn tokenizer for sewer defect classification

Authors

Abstract

A crucial part of image classification consists of capturing the non-local spatial semantics of image content. This paper describes the multi-scale hybrid vision transformer (MSHViT), an extension of the classical convolutional neural network (CNN) backbone, for multi-label sewer defect classification. To better model spatial semantics in images, features are aggregated non-locally at different scales through the use of a lightweight transformer, and a smaller set of tokens is produced by a novel Sinkhorn clustering-based tokenizer using distinct cluster centers. The proposed MSHViT was evaluated on the Sewer-ML dataset, showing consistent performance improvements of up to 2.53 percentage points.
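The abstract gives no implementation details, so the following is only a minimal sketch of how a Sinkhorn clustering-based tokenizer could reduce many feature vectors to a few tokens. It is not the authors' method. The function names `sinkhorn` and `sinkhorn_tokenize`, the cosine-similarity scores, the temperature `eps`, the iteration count, and the array shapes are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn(scores, n_iters=50, eps=0.05):
    """Balance a (n_features, n_clusters) score matrix with Sinkhorn iterations.

    Returns a non-negative soft assignment whose rows each sum to 1.
    The alternating normalization pushes clusters toward equal total mass.
    """
    # Subtract the global max before exponentiating for numerical stability.
    k_mat = np.exp((scores - scores.max()) / eps)
    n, k = k_mat.shape
    for _ in range(n_iters):
        # Column step: every cluster receives total mass 1/k.
        k_mat = k_mat / k_mat.sum(axis=0, keepdims=True) / k
        # Row step: every feature distributes total mass 1/n.
        k_mat = k_mat / k_mat.sum(axis=1, keepdims=True) / n
    return k_mat * n  # rescale so that each row sums to 1


def sinkhorn_tokenize(features, centers, n_iters=50, eps=0.05):
    """Pool (n, d) features into (k, d) tokens using k cluster centers.

    `centers` is a hypothetical (k, d) array; in a trained model it would
    presumably be learned, but that is an assumption of this sketch.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    c = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    assign = sinkhorn(f @ c.T, n_iters=n_iters, eps=eps)  # (n, k)
    # Each token is the assignment-weighted mean of the features.
    weights = assign / assign.sum(axis=0, keepdims=True)
    tokens = weights.T @ features  # (k, d)
    return tokens, assign


# Example: reduce a 14x14 feature map (196 positions, 64 channels) to 8 tokens.
rng = np.random.default_rng(0)
features = rng.normal(size=(196, 64))
centers = rng.normal(size=(8, 64))
tokens, assign = sinkhorn_tokenize(features, centers)
print(tokens.shape)  # (8, 64)
```

In this sketch the reduced token set would then be passed to a transformer in place of the full feature map, which is where the reduction in attention cost would come from.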

Similar resources

tight frame approximation for multi-frames and super-frames

In this thesis, a generator for multi-frames or super-frames generated under the action of a projective unitary representation for discrete countable groups is studied. Examples of such frames are Gabor multi-frames, Gabor super-frames, and frames for shift-invariant subspaces. We show that there exists a unique normalized tight multi-frame (super-frame) generator that has minimum distance from it. Similar problems are also posed for dual frames, and some ...

Using Evidence Combination for Transformer Defect Diagnosis

This paper describes a number of methods of evidence combination, and their applicability to the domain of transformer defect diagnosis. It explains how evidence combination fits into an on-line and implemented agent-based condition monitoring system, and the benefits of giving selected agents reflective abilities. Reflection has not previously been deployed in an industrial setting, and theore...

Identification of Houseplants Using Neuro-vision Based Multi-stage Classification System

In this paper, we present a machine vision system that was developed on the basis of neural networks to identify twelve houseplants. An image processing system was used to extract 41 features of color, texture and shape from images taken of the front and back of the leaves. The features were fed into the neural network system as the recognition criteria and inputs. Multilayer perceptron (MLP) ne...

Multi-scale embedded descriptor for shape classification

We present a new shape descriptor that is robust to deformation and captures part details. In our framework, the shape descriptor is generated by 1) using the running angle to transform a shape into a 2-D description image in the position and scale space, and 2) performing circular wavelet-like sub-band decomposition of this 2-D description image based on its periodic convolution with orthogonal ker...

Learning Multi-scale Representations for Material Classification

The recent progress in sparse coding and deep learning has made unsupervised feature learning methods a strong competitor to hand-crafted descriptors. In computer vision, success stories of learned features have been predominantly reported for object recognition tasks. In this paper, we investigate if and how feature learning can be used for material recognition. We propose two strategies to in...

Journal

Journal title: Automation in Construction

Year: 2022

ISSN: 1872-7891, 0926-5805

DOI: https://doi.org/10.1016/j.autcon.2022.104614